Smart Paper Notes

SparCL: Sparse Continual Learning on the Edge

Zifeng Wang, Zheng Zhan, Yifan Gong, Geng Yuan, Wei Niu, Tong Jian, Bin Ren, Stratis Ioannidis, Yanzhi Wang, Jennifer Dy

Categories: Machine Learning | Artificial Intelligence | Computer Vision

2022-09-20

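Existing work in continual learning (CL) focuses on mitigating catastrophic forgetting, i.e., the deterioration of model performance on past tasks when learning a new task. However, the training efficiency of CL systems is under-investigated, which limits their real-world application in resource-limited scenarios. In this work, we propose a novel framework called Sparse Continual Learning (SparCL), the first study that leverages sparsity to enable cost-effective continual learning on edge devices. SparCL achieves both training acceleration and accuracy preservation through the synergy of three aspects: weight sparsity, data efficiency, and gradient sparsity. Specifically, we propose learning a sparse network throughout the entire CL process, dynamic data removal (DDR) to remove less informative training data, and dynamic gradient masking (DGM) to sparsify the gradient updates. Each of them not only improves efficiency but also further mitigates catastrophic forgetting. SparCL consistently improves the training efficiency of existing state-of-the-art (SOTA) CL methods by reducing training FLOPs and, surprisingly, further improves the SOTA accuracy by at most 1.7%. SparCL also outperforms competitive baselines obtained by adapting SOTA sparse training methods to the CL setting, in both efficiency and accuracy. We also evaluate the effectiveness of SparCL on a real mobile phone, further indicating the practical potential of our method.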
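
The idea of sparsifying gradient updates can be illustrated with a minimal sketch. It assumes a simple magnitude-based top-k criterion, which is an illustrative choice and not necessarily the selection rule used by SparCL; the function names `topk_gradient_mask` and `masked_sgd_step` are hypothetical.

```python
def topk_gradient_mask(grads, keep_ratio):
    """Return a 0/1 mask keeping the largest-magnitude gradient entries.

    Assumption: importance is measured by absolute gradient value.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    k = max(1, round(len(grads) * keep_ratio))
    # Indices sorted by descending magnitude; the sort is stable,
    # so ties are broken by position.
    order = sorted(range(len(grads)), key=lambda i: -abs(grads[i]))
    keep = set(order[:k])
    return [1 if i in keep else 0 for i in range(len(grads))]


def masked_sgd_step(weights, grads, mask, lr):
    """Apply an SGD update only at positions where the mask is 1."""
    return [w - lr * g * m for w, g, m in zip(weights, grads, mask)]


weights = [0.5, -0.2, 0.1, 0.8]
grads = [0.01, -0.40, 0.05, 0.30]
mask = topk_gradient_mask(grads, keep_ratio=0.5)
new_weights = masked_sgd_step(weights, grads, mask, lr=0.1)
print(mask)
print(new_weights)
```

With half of the entries masked, only the two weights with the largest gradients are updated; the others keep their previous values, which is one plausible way a sparse update could also limit interference with earlier tasks.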

translated by Google Translate

Q-ViT: Fully Differentiable Quantization for Vision Transformer

Zhexin Li, Tong Yang, Peisong Wang, Jian Cheng

Categories: Computer Vision

2022-01-19

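In this paper, we propose a fully differentiable quantization method for vision transformers (ViT), named Q-ViT, in which both the quantization scales and the bit-widths are learnable parameters. Specifically, based on our observation that heads in ViT display different quantization robustness, we leverage head-wise bit-widths to squeeze the size of Q-ViT while preserving performance. In addition, we propose a novel technique named switchable scale to resolve the convergence problem in the joint training of quantization scales and bit-widths. In this way, Q-ViT pushes the limits of ViT quantization to 3 bits without a heavy drop in performance. Moreover, we analyze the quantization robustness of every architectural component of ViT and show that Multi-head Self-Attention (MSA) and Gaussian Error Linear Units (GELU) are the key aspects of ViT quantization. This study provides some insights for further research on ViT quantization. Extensive experiments on different ViT models, such as DeiT and Swin Transformer, show the effectiveness of our quantization method. In particular, our method outperforms the state-of-the-art uniform quantization method by 1.5% on DeiT-Tiny.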
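
To make the roles of the quantization scale and bit-width concrete, here is a minimal sketch of a symmetric uniform quantizer. It shows only the forward pass with fixed values; how Q-ViT learns the scale and bit-width is not shown, and the function name `fake_quantize` is hypothetical.

```python
def fake_quantize(values, scale, bits):
    """Symmetric uniform quantization onto a grid with step `scale`.

    Assumption: signed integer levels in [-2^(bits-1), 2^(bits-1) - 1].
    """
    qmin = -(2 ** (bits - 1))
    qmax = 2 ** (bits - 1) - 1
    out = []
    for v in values:
        q = round(v / scale)          # nearest integer level
        q = max(qmin, min(qmax, q))   # clamp to the representable range
        out.append(q * scale)         # map back to real values
    return out


x = [-1.3, -0.26, 0.0, 0.4, 2.0]
y = fake_quantize(x, scale=0.5, bits=3)
print(y)
```

In a head-wise setting, each attention head would call such a quantizer with its own `bits` value, so that robust heads can use fewer bits than sensitive ones.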

Deep Learning on Multimodal Sensor Data at the Wireless Edge for Vehicular Network

Batool Salehi, Guillem Reus-Muns, Debashri Roy, Zifeng Wang, Tong Jian, Jennifer Dy, Stratis Ioannidis, Kaushik Chowdhury

Categories: Machine Learning

2022-01-12

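Beam selection for millimeter-wave links in a vehicular scenario is a challenging problem, as an exhaustive search among all candidate beam pairs cannot be assuredly completed within short contact times. We address this problem by leveraging multimodal data collected from sensors such as LiDAR, camera images, and GPS. We propose individual-modality and distributed fusion-based deep learning (F-DL) architectures that can execute locally as well as at a mobile edge computing center (MEC), and study the associated trade-offs. We also formulate and solve an optimization problem that considers practical beam-searching, MEC processing, and sensor-to-MEC data delivery latency overheads to determine the output dimensions of the above F-DL architectures. Extensive evaluations on publicly available synthetic and home-grown real-world datasets reveal 95% and 96% improvements in beam selection speed over classical RF-only beam sweeping, respectively. F-DL also outperforms state-of-the-art techniques by 20-22% in predicting the top-10 best beam pairs.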
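
The top-k evaluation mentioned above can be sketched as follows: a model scores every candidate beam pair, and a prediction counts as correct if the true best pair is among the k highest-scoring candidates, so that only those k pairs need to be swept. The scores and labels below are made-up illustrative values, and the helper names are hypothetical.

```python
def top_k_indices(scores, k):
    """Indices of the k highest scores, best first."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]


def top_k_accuracy(score_rows, best_beams, k):
    """Fraction of samples whose true best beam pair is in the top-k set."""
    hits = sum(
        1
        for scores, best in zip(score_rows, best_beams)
        if best in top_k_indices(scores, k)
    )
    return hits / len(best_beams)


# Each row holds model scores for three candidate beam pairs (toy data).
score_rows = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
best_beams = [1, 1, 0]  # index of the true best pair per sample (toy data)

acc_top1 = top_k_accuracy(score_rows, best_beams, k=1)
acc_top2 = top_k_accuracy(score_rows, best_beams, k=2)
print(acc_top1, acc_top2)
```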

LGD: Label-guided Self-distillation for Object Detection

Peizhen Zhang, Zijian Kang, Tong Yang, Xiangyu Zhang, Nanning Zheng, Jian Sun

Categories: Computer Vision

2021-09-23

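In this paper, we propose the first self-distillation framework for general object detection, termed LGD (Label-Guided self-Distillation). Previous studies rely on a strong pretrained teacher to provide instructive knowledge, which may be unavailable in real-world scenarios. Instead, we generate instructive knowledge through inter-object and intra-object relation modeling, requiring only student representations and regular labels. Concretely, our framework involves sparse label-appearance encoding, inter-object relation adaptation, and intra-object knowledge mapping to obtain the instructive knowledge. Together they form an implicit teacher during the training phase, dynamically dependent on the labels and the evolving student representations. The modules in LGD are trained end-to-end with the student detector and are discarded at inference. Experimentally, LGD obtains decent results on various detectors, datasets, and extensive tasks such as instance segmentation. For example, on the MS-COCO dataset, LGD improves RetinaNet with ResNet-50 under 2x single-scale training from 36.2% to 39.0% mAP (+2.8%). It also boosts much stronger detectors such as FCOS with ResNeXt-101 DCN v2 under 2x multi-scale training from 46.1% to 47.9% (+1.8%). Compared with the classical teacher-based method FGFI, LGD not only performs better without requiring a pretrained teacher but also reduces the training cost beyond inherent student learning by 51%.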

Anchor DETR: Query Design for Transformer-Based Object Detection

Yingming Wang, Xiangyu Zhang, Tong Yang, Jian Sun

Categories: Computer Vision

2021-09-15

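In this paper, we propose a novel query design for transformer-based object detection. In previous transformer-based detectors, the object queries are a set of learned embeddings. However, each learned embedding has no explicit physical meaning, and we cannot explain where it will focus. It is also difficult to optimize, because the prediction slot of each object query does not have a specific mode. In other words, each object query does not focus on a specific region. To solve these problems, in our query design the object queries are based on anchor points, which are widely used in CNN-based detectors, so each object query focuses on the objects near its anchor point. Moreover, our query design can predict multiple objects at one position to address the difficulty of "one region, multiple objects". In addition, we design an attention variant that reduces the memory cost while achieving similar or better performance than the standard attention in DETR. Thanks to the query design and the attention variant, the proposed detector, which we call Anchor DETR, achieves better performance and runs faster than DETR with 10× fewer training epochs. For example, it achieves 44.2 AP on the MSCOCO dataset when using the ResNet50-DC5 feature and training for 50 epochs. Extensive experiments on the MSCOCO benchmark demonstrate the effectiveness of the proposed methods. Code is available at https://github.com/megvii-research/anchordetr.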
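
The notion of anchor-point queries can be illustrated with a small sketch that builds a uniform grid of normalized anchor points and finds the anchor closest to an object center. Both the grid layout and the nearest-anchor lookup are assumptions made for illustration; they are not the detector's actual query encoding or matching procedure, and the function names are hypothetical.

```python
def grid_anchor_points(rows, cols):
    """Uniform grid of normalized (x, y) points at cell centers in [0, 1]."""
    return [
        ((c + 0.5) / cols, (r + 0.5) / rows)
        for r in range(rows)
        for c in range(cols)
    ]


def nearest_anchor(point, anchors):
    """Index of the anchor with the smallest squared distance to `point`."""
    px, py = point
    return min(
        range(len(anchors)),
        key=lambda i: (anchors[i][0] - px) ** 2 + (anchors[i][1] - py) ** 2,
    )


anchors = grid_anchor_points(2, 2)
idx = nearest_anchor((0.8, 0.3), anchors)  # hypothetical object center
print(anchors)
print(idx)
```

Because every query is tied to a fixed position, it is easy to say which image region a query is responsible for, which is the interpretability benefit the abstract describes.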

InfoFair: Information-Theoretic Intersectional Fairness

Jian Kang, Tiankai Xie, Xintao Wu, Ross Maciejewski, Hanghang Tong

Categories: Machine Learning | Machine Learning (Statistics)

2021-05-24

Algorithmic fairness is becoming increasingly important in data mining and machine learning. Among others, a foundational notion is group fairness. The vast majority of existing work on group fairness, with a few exceptions, primarily focuses on debiasing with respect to a single sensitive attribute, despite the fact that the co-existence of multiple sensitive attributes (e.g., gender, race, marital status) in the real world is commonplace. As such, methods that can ensure a fair learning outcome with respect to all sensitive attributes of concern simultaneously need to be developed. In this paper, we study the problem of information-theoretic intersectional fairness (InfoFair), where statistical parity, a representative group fairness measure, is guaranteed among demographic groups formed by multiple sensitive attributes of interest. We formulate it as a mutual information minimization problem and propose a generic end-to-end algorithmic framework to solve it. The key idea is to leverage a variational representation of mutual information, which considers the variational distribution between learning outcomes and sensitive attributes, as well as the density ratio between the variational and the original distributions. Our proposed framework is generalizable to many different settings, including other statistical notions of fairness, and can handle any type of learning task equipped with a gradient-based optimizer. Empirical evaluations in the fair classification task on three real-world datasets demonstrate that our proposed framework can effectively debias the classification results with minimal impact on the classification accuracy.
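
As a concrete view of statistical parity over intersectional groups, the sketch below computes the positive-prediction rate for each group formed by a tuple of sensitive attributes and reports the largest gap between groups. It only measures parity and does not implement the mutual-information minimization framework; the data and function names are hypothetical. Statistical parity holds when predictions are independent of the sensitive attributes, in which case the gap is zero.

```python
from collections import defaultdict


def positive_rates(predictions, sensitive):
    """Positive-prediction rate for each intersectional group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for y_hat, group in zip(predictions, sensitive):
        counts[group][0] += int(y_hat == 1)
        counts[group][1] += 1
    return {g: pos / total for g, (pos, total) in counts.items()}


def parity_gap(rates):
    """Largest difference in positive rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Toy predictions with two sensitive attributes per individual.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = [
    ("F", "A"), ("F", "A"), ("F", "B"), ("F", "B"),
    ("M", "A"), ("M", "A"), ("M", "B"), ("M", "B"),
]
rates = positive_rates(preds, groups)
gap = parity_gap(rates)
print(rates)
print(gap)
```

Note that each single attribute can look less biased than the intersection: here the gap within the first attribute alone is smaller than the gap across the four combined groups.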

FGraDA: A Dataset and Benchmark for Fine-Grained Domain Adaptation in Machine Translation

Wenhao Zhu, Shujian Huang, Tong Pu, Pingxuan Huang, Xu Zhang, Jian Yu, Wei Chen, Yanfeng Wang, Jiajun Chen

Categories: Natural Language Processing

2020-12-31

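Previous research on adapting a general neural machine translation (NMT) model to a specific domain usually neglects the diversity of translation within the same domain, which is a core problem for domain adaptation in real-world scenarios. One representative of such challenging scenarios is deploying a translation system for a conference with a specific topic, e.g., global warming or coronavirus, where resources are usually extremely scarce because of the limited schedule. To motivate wider investigation of such scenarios, we present a real-world fine-grained domain adaptation task in machine translation (FGraDA). The FGraDA dataset consists of Chinese-English translation tasks for four sub-domains of information technology: autonomous vehicles, AI education, real-time networks, and smartphones. Each sub-domain is equipped with a development set and a test set for evaluation purposes. To be closer to reality, FGraDA does not employ any in-domain bilingual training data but provides bilingual dictionaries and a wiki knowledge base, which can be obtained more easily within a short time. We benchmark the fine-grained domain adaptation task and present in-depth analyses showing that there are still challenging problems in further improving performance with heterogeneous resources.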

Rethinking Mobile Block for Efficient Neural Models

Jiangning Zhang, Xiangtai Li, Jian Li, Liang Liu, Zhucun Xue, Boshen Zhang, Zhengkai Jiang, Tianxin Huang, Yabiao Wang, Chengjie Wang

Categories: Computer Vision

2023-01-03

This paper focuses on designing efficient models with few parameters and low FLOPs for dense prediction. Even though CNN-based lightweight methods have achieved stunning results after years of research, the trade-off between model accuracy and constrained resources still needs further improvement. This work rethinks the essential unity of the efficient Inverted Residual Block in MobileNetv2 and the effective Transformer in ViT, inductively abstracting a general concept of the Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance even when the same framework is shared. Motivated by this phenomenon, we deduce a simple yet efficient modern Inverted Residual Mobile Block (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependencies and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase Efficient MOdel (EMO) based only on a series of iRMBs for dense applications. Extensive experiments on the ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of our EMO over state-of-the-art methods; e.g., our EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1 accuracy, surpassing SoTA CNN-/Transformer-based models while trading off model accuracy and efficiency well.

MGTAB: A Multi-Relational Graph-Based Twitter Account Detection Benchmark

Shuhao Shi, Kai Qiao, Jian Chen, Shuai Yang, Jie Yang, Baojie Song, Linyuan Wang, Bin Yan

Categories: Computer Vision

2023-01-03

The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which hinders graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain, together with user tweet features, as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
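
Selecting features by information gain, as mentioned above, can be sketched for discrete features as follows. The feature names and toy labels are hypothetical and are not taken from MGTAB; the sketch only shows the standard entropy-based definition of information gain.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    by_value = {}
    for f, y in zip(feature, labels):
        by_value.setdefault(f, []).append(y)
    conditional = sum(len(ys) / n * entropy(ys) for ys in by_value.values())
    return entropy(labels) - conditional


labels = ["bot", "bot", "human", "human"]
features = {
    "verified": [0, 0, 1, 1],  # hypothetical feature that separates the classes
    "has_bio": [0, 1, 0, 1],   # hypothetical feature that carries no signal
}
gains = {name: information_gain(col, labels) for name, col in features.items()}
ranked = sorted(gains, key=gains.get, reverse=True)
print(gains)
print(ranked)
```

Keeping the first 20 entries of such a ranking would correspond to retaining the 20 most informative property features.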

More is Better: A Database for Spontaneous Micro-Expression with High Frame Rates

Sirui Zhao, Huaying Tang, Xinglong Mao, Shifeng Liu, Hanqing Tao, Hao Wang, Tong Xu, Enhong Chen

Categories: Computer Vision

2023-01-03

As one of the most important psychic stress reactions, micro-expressions (MEs) are spontaneous and transient facial expressions that can reveal the genuine emotions of human beings. Thus, recognizing MEs automatically (MER) is becoming increasingly crucial in the field of affective computing, and provides essential technical support for lie detection, psychological analysis, and other areas. However, the lack of abundant ME data seriously restricts the development of cutting-edge data-driven MER models. Despite recent efforts to alleviate this problem with several spontaneous ME datasets, the amount of such work is still tiny. To solve the problem of ME data hunger, we construct a dynamic spontaneous ME dataset with the largest current ME data scale, called DFME (Dynamic Facial Micro-expressions), which includes 7,526 well-labeled ME videos induced from 671 participants and annotated by more than 20 annotators over three years. Afterwards, we adopt four classical spatiotemporal feature learning models on DFME to perform MER experiments that objectively verify the validity of the DFME dataset. In addition, we explore different solutions to the class imbalance and key-frame sequence sampling problems in dynamic MER on DFME, so as to provide a valuable reference for future research. The comprehensive experimental results show that our DFME dataset can facilitate research on automatic MER and provide a new benchmark for MER. DFME will be published via https://mea-lab-421.github.io.
